Overview: Information Extraction from Broadcast News
Author
Abstract
Broadcast news is a rich domain for information extraction, but one that presents new challenges for evaluation. In this paper we present an overview of the first evaluation of information extraction from broadcast news that was conducted as part of the DARPA-funded Hub 4 1998 workshop. We discuss the work that was required to design and administer the evaluation, describe some of the challenges that we encountered, and summarize the results of the evaluation.
Related resources
The ESTER Evaluation Campaign for the Rich Transcription of French Broadcast News
This paper gives an overview of the ESTER evaluation campaign. The aim of this campaign is to evaluate automatic broadcast news transcription systems for the French language. The evaluation tasks are divided into three main categories: orthographic transcription, event detection and tracking (e.g. speech vs. music, speaker tracking), and information extraction (e.g. named entity detection, topi...
Discourse Cues for Broadcast News Segmentation
This paper describes the design and application of time-enhanced, finite state models of discourse cues to the automated segmentation of broadcast news. We describe our analysis of a broadcast news corpus, the design of a discourse cue based story segmentor that builds upon information extraction techniques, and finally its computational implementation and evaluation in the Broadcast News Navig...
Keyphrase Cloud Generation of Broadcast News
This paper describes an enhanced automatic keyphrase extraction method applied to Broadcast News. The keyphrase extraction process is used to create a concept level for each news story, on top of the words produced by a speech recognition system and news indexation, and it contributes to the generation of a tag/keyphrase cloud of the top news included in a Multimedia Monitoring Solution system f...
Information Extraction from Broadcast News
This paper discusses the development of trainable statistical models for extracting content from television and radio news broadcasts. In particular we concentrate on statistical finite state models for identifying proper names and other named entities in broadcast speech. Two models are presented: the first represents name class information as a word attribute; the second represents both word-...
Topic extraction with multiple topic-words in broadcast-news speech
This paper reports on topic extraction in Japanese broadcast-news speech. Using continuous speech recognition, we studied the extraction of several topic-words from broadcast news. A combination of multiple topic-words represents the content of the news. This is a more detailed and more flexible approach than using a single word or a single category. A topic-extraction model shows the degree of...